# Demo of how to convert values with string suffixes to numeric
x <- c("12k", "11k", "8900", "8400", "11k", "10k")
x %>%
str_replace_all("k", "e3") %>%
str_replace_all("M", "e6") %>%
as.numeric()[1] 12000 11000 8900 8400 11000 10000
This project is a culmination of the skills and knowledge you’ve gained throughout the course. Your task is to create an interactive dashboard published to GitHub Pages, showcasing global statistics in a visually engaging way. You will choose one or several datasets on global indicators from Gapminder or other sources, clean and prepare the data, and create a multi-page dashboard with interactive visualizations.
Choose a dataset on global indicators from Gapminder’s data repository. Gapminder provides a vast repository of country-level statistics on a wide range of topics, including but not limited to:
The data is available to download in CSV files which you can then import, clean, and analyze.
Visit the Gapminder’s data repository and browse the indicators from the dropdown menu. You can use the search box to look for topics of interest.

Preview the spreadsheet and read more about the indicator. Ensure the dataset is relatively complete with minimal missing entries, especially in recent years. You can click on the icons in “VIEW AS: 🎈〽️” to visualize the data as a bubble plot or lin

You can also create exploratory visualizations of any indicator with Gapminder tools. By default, it shows bubble plot of Life expectancy vs. GDP per capita, sized by Population, but you can customize the plot by choosing different indicators or different types of visualizations

You can choose different plot types from the dropdown menu at the top left of the page. Maps, Trends, and Ranks are particularly useful visualizations.

Here are a few steps and considerations for selecting your data:
Comprehensiveness: Check for data completeness. Ideally should have at least 10 years with minimal missing entries.
Relevance: Choose data that is up-to-date. Avoid datasets with outdated statistics (e.g., malaria case data is only recorded until 2006).
Relationships: While we require you to choose only one indicator, consider analyzing relationships between two or more indicators. For example, you could compare trends in sanitation levels with child mortality, or how TB incidence correlates with HIV incidence.
Choose indicators that appeal to you and download the data in CSV format using the “DOWNLOAD AS: ⏬CSV” option.
Gapminder also provides country metadata here. This contains useful variables you may want to join with your indicator dataset. Key variables of interest might be:
Country codes: Standardized 3-letter country codes, same as ISO. Useful for joining with other datasets.
Regions: Geographic divisions to group and summarize by. Useful for comparing indicators across continents.
Income groups: Can be converted to an ordered factor variable for visualizing relationships between income level your indicator.
While Gapminder is recommended for its ease of access and breadth of indicators, you are welcome to to explore other datasets that might offer richer details. For example, the WHO World Malaria Report provides data on malaria cases and deaths from 2010-2022, available for download as Excel files.
Other websites to explore are:
These sources may require more complex data cleaning processes.
Gapminder indicator datasets are provided in a wide format with one row per country and columns representing years. You will need to pivot this data from wide to long format for easier filtering, grouping, and plotting.
Convert string representations of numbers (like “20k” or “2M”) into actual numeric values using the {stringr} package.
# Demo of how to convert values with string suffixes to numeric
x <- c("12k", "11k", "8900", "8400", "11k", "10k")
x %>%
str_replace_all("k", "e3") %>%
str_replace_all("M", "e6") %>%
as.numeric()[1] 12000 11000 8900 8400 11000 10000
Use the {countrycode} package to align country names with their ISO codes, ensuring accurate merges with geospatial data.
Alternatively, you can join your Gapminder indicator data with the Gapminder geographic metadata by country name, and use the geo column as the ISO code. The country names in these two datasets will match exactly, since they are both compiled by Gapminder.
Use the {rnaturalearth} package to download the country polygons. This data provides the geographical shapes necessary for plotting your world map.
After aligning the country names or ISO codes between your datasets, merge the Gapminder data with the country polygons.
Organize your project repository as follows:
_.Rproj: Rstudio project file.
_.qmd: Main project dashboard.
_.html: Rendered HTML dashboard.
/data: Data folder
/data/*.csv: Your dataset(s) in CSV format
/data/README.md: Metadata about your dataset including information on provenance, codebook, variable definitions, etc.
/images: Images folder
Create your Quarto project and choose appropriate document options, defining the title and author for the navigation bar as well as specifying the use of the dashboard format.
Optionally, you can also include a logo and one or more nav-buttons.
---
title: "DASHBOARD TITLE"
author: "YOUR NAME"
format:
dashboard:
logo: images/LOGO_IMAGE.png
nav-buttons: [github]
github: https://github.com/YOUR_URL
theme: lux
execute:
echo: false
warning: false
message: false
---
Set up your environment with the required libraries:
# Load packages
if(!require(pacman)) install.packages("pacman")
pacman::p_load(tidyverse,
here,
sf,
bslib,
bsicons,
rnaturalearth,
plotly,
countrycode,
htmltools,
reactable,
janitor
)Create a multi-page dashboard using Quarto. Divide each page into sections using headings, and organize your content into rows and/or columns to create a visually appealing layout.
Requirements:
Your dashboard should not only be informative but also engaging and easy to navigate. Incorporate various Quarto features to enhance user experience, such as:
Statistical Highlights: Use value boxes to display key statistics, such as the highest and lowest values for the selected indicator, or significant year-on-year changes. Highlight interesting geographical trends, such as a country that deviates significantly from regional norms.
Professional Aesthetics: Employ custom themes and color palettes to make the visual presentation as professional as possible.
World Map: Use {ggplot2} and the ggploty() function from {plotly} to create an interactive choropleth or dot map that allows users to explore your chosen indicator by country and year.
Additional Visualizations: At least two other interactive charts or tables, such as scatter plots of GDP vs. health indicators or box plots segmented by continent.
Customize visual aesthetics significantly beyond the defaults to ensure a professional appearance. Each visualization should be accompanied by a descriptive title, clear legends, and annotated axes.
Yearly Data Interaction: Consider implementing a slider to allow viewers to see changes over time on the map. You can do this by adding a frame aesthetic to ggplot, and allows interactive, linked views of a series of frames over time.
gg <- gapminder::gapminder %>%
ggplot(aes(x = gdpPercap, y = lifeExp, color = continent,
frame = year)) +
geom_point() +
scale_x_log10() +
theme_minimal()
ggplotly(gg)You should deploy your dashboard to GitHub Pages for easy access and sharing. Consult the lesson on Deploying Dashboards with Quarto for instructions on how to set up your GitHub repository and deploy your dashboard.
Submit a ZIP file of your repository and a link to the deployed dashboard on GitHub Pages through the course webpage by July 12, 2024.